Maximizing the Probability of Arriving on Time: A Practical Q-Learning Method

نویسندگان

Zhiguang Cao

Hongliang Guo

Jie Zhang

Frans A. Oliehoek

Ulrich Fastenrath

چکیده

The stochastic shortest path problem is of crucial importance for the development of sustainable transportation systems. Existing methods based on the probability tail model seek for the path that maximizes the probability of arriving at the destination before a deadline. However, they suffer from low accuracy and/or high computational cost. We design a novel Q-learning method where the converged Q-values have the practical meaning as the actual probabilities of arriving on time so as to improve accuracy. By further adopting dynamic neural networks to learn the value function, our method can scale well to large road networks with arbitrary deadlines. Experimental results on real road networks demonstrate the significant advantages of our method over other counterparts.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

طراحی پایدارساز PSS3B بر اساس الگوریتم KH و Q-learning برای میراسازی نوسانات فرکانس پایین سیستم قدرت تک‌ماشینه

The main purpose of this paper is to develop a supplementary signal using reinforcement learning (RL) to improve the performance of power system stabilizer (PSS). RL is one of the most important issues in the field of artificial intelligence and is the popular method for solving Markov decision procedure (MDP). In this paper, a control method is developed based on Q-learning and used to improve...

متن کامل

A Q-learning Based Continuous Tuning of Fuzzy Wall Tracking

A simple easy to implement algorithm is proposed to address wall tracking task of an autonomous robot. The robot should navigate in unknown environments, find the nearest wall, and track it solely based on locally sensed data. The proposed method benefits from coupling fuzzy logic and Q-learning to meet requirements of autonomous navigations. Fuzzy if-then rules provide a reliable decision maki...

متن کامل

Evaluating project’s completion time with Q-learning

Nowadays project management is a key component in introductory operations management. The educators and the researchers in these areas advocate representing a project as a network and applying the solution approaches for network models to them to assist project managers to monitor their completion. In this paper, we evaluated project’s completion time utilizing the Q-learning algorithm. So the ...

متن کامل

Mini/Micro-Grid Adaptive Voltage and Frequency Stability Enhancement Using Q-learning Mechanism

This paper develops an adaptive control method for controlling frequency and voltage of an islanded mini/micro grid (M/µG) using reinforcement learning method. Reinforcement learning (RL) is one of the branches of the machine learning, which is the main solution method of Markov decision process (MDPs). Among the several solution methods of RL, the Q-learning method is used for solving RL in th...

متن کامل

Transient Analysis of M/M/R Machining System with Mixed Standbys, Switching Failures, Balking, Reneging and Additional Removable Repairmen

The objective of this paper is to study the M/M/R machine repair queueing system with mixed standbys. The life-time and repair time of units are assumed to be exponentially distributed. Failed units are repaired on FCFS basis. The standbys have switching failure probability q (0≤q≤1). The repair facility of the system consists of R permanent as well as r additional removable repairmen. Due to i...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2017

Maximizing the Probability of Arriving on Time: A Practical Q-Learning Method

نویسندگان

چکیده

منابع مشابه

طراحی پایدارساز PSS3B بر اساس الگوریتم KH و Q-learning برای میراسازی نوسانات فرکانس پایین سیستم قدرت تک‌ماشینه

A Q-learning Based Continuous Tuning of Fuzzy Wall Tracking

Evaluating project’s completion time with Q-learning

Mini/Micro-Grid Adaptive Voltage and Frequency Stability Enhancement Using Q-learning Mechanism

Transient Analysis of M/M/R Machining System with Mixed Standbys, Switching Failures, Balking, Reneging and Additional Removable Repairmen

عنوان ژورنال:

اشتراک گذاری